Anticipatory Learning Classifier Systems and Factored Reinforcement Learning
Abstract
Factored Reinforcement Learning (FRL) is a new technique to solve Factored Markov Decision Problems (FMDPs) when the structure of the problem is not known in advance. Like Anticipatory Learning Classifier Systems (ALCSs), it is a model-based Reinforcement Learning approach that includes generalization mechanisms in the presence of a structured domain. In general, FRL and ALCSs are explicit, state-anticipatory approaches that learn generalized state transition models to improve system behavior based on model-based reinforcement learning techniques. In this contribution, we highlight the conceptual similarities and differences between FRL and ALCSs, focusing on the one hand on SPITI, an instance of an FRL method, and on the other hand on two ALCSs, MACS and XACS. Though FRL systems seem to benefit from a clearer theoretical grounding, an empirical comparison between SPITI and XACS on two benchmark problems reveals that the latter scales much better than the former when some combinations of state variables do not occur. Based on this finding, we discuss the mechanisms in XACS that result in the better scalability and propose importing these mechanisms into FRL systems.
Similar resources
Anticipations Control Behavior: Animal Behavior in an Anticipatory Learning Classifier System
The concept of anticipations controlling behavior is introduced. Background is provided about the importance of anticipations from a psychological perspective. Based on the psychological background wrapped in a framework of anticipatory behavioral control, the anticipatory learning classifier system ACS2 is explained. ACS2 learns and generalizes on-line a predictive environmental model (a model...
Learning Classifier Systems using the Cognitive Mechanism of Anticipatory Behavioral Control
A classifier system (CS) is a machine learning system that learns a collection of rules, called classifiers. Mostly, classifiers can be regarded as simple stimulus-response rules. A first level of learning, called the credit assignment level, consists of reinforcement learning on these classifiers. A classifier is reinforced depending on the result of an interaction between the CS and its environment...
The Introduction of a Heuristic Mutation Operator to Strengthen the Discovery Component of XCS
The extended classifier system (XCS) tries to solve learning problems online by producing a set of rules (classifiers). XCS is a rather complex combination of a genetic algorithm and reinforcement learning: the genetic algorithm tries to discover promising rules, and reinforcement learning assigns values to them. Among the important factors in the performance of XCS is the possibility to...
Improving MACS Thanks to a Comparison with 2TBNs
Factored Markov Decision Processes form the theoretical framework underlying multi-step Learning Classifier Systems research. This framework is mostly used in the context of Two-stage Bayes Networks, a subset of Bayes Networks. In this paper, we compare the Learning Classifier Systems approach and the Bayes Networks approach to factored Markov Decision Problems. More specifically, we focus on a c...